Throughout history, vaccines have helped humankind overcome pandemics. Given the highly contagious nature of the coronavirus, which has affected populations worldwide, vaccines are among the most effective and vital tools for protecting public health. The novel coronavirus, first reported in connection with a seafood market in Wuhan, Hubei Province, China, in December 2019, spread worldwide and caused COVID-19 in millions of people. Its high human-to-human transmission rate led the World Health Organization (WHO) to declare COVID-19 a pandemic on March 11, 2020. Vaccines were urgently needed to curb the spread and reduce the associated morbidity and mortality. In December 2020, several vaccines to prevent COVID-19 infection were approved, while several others were still in development. A vaccine is developed through a lengthy process involving numerous testing phases to ensure adequate safety and immunogenicity across different groups of people. The search for new and more effective vaccines is ongoing, and more will likely emerge as the pandemic proceeds. In this article, we try to visualize the inequalities that lie beneath the vaccination drive going on worldwide.
In this article, we use the COVID-19 World Vaccination Progress dataset available on Kaggle, which records daily and total COVID-19 vaccinations worldwide. The table below shows the dataset’s contents: 15 columns that track the progress of COVID-19 vaccination around the globe. These 15 columns comprise 9 decimal columns, 3 string columns, 1 country column, and 2 columns of other types.
A second dataset, included in the same download as the first, lists the vaccines from different manufacturers that are in use in different countries. It has 4 columns, which are shown in the second table below.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import matplotlib
%matplotlib inline
import plotly
plotly.offline.init_notebook_mode()
import plotly.express as px
sns.set_style('darkgrid')
matplotlib.rcParams['font.size'] = 12
matplotlib.rcParams['figure.facecolor'] = '#00000000'
matplotlib.rcParams['figure.figsize'] = (10, 6)
plt.rc('font', size=11)
df1=pd.read_csv('country_vaccinations.csv')
df2=pd.read_csv('country_vaccinations_by_manufacturer.csv')
df1.head(10)
| | country | iso_code | date | total_vaccinations | people_vaccinated | people_fully_vaccinated | daily_vaccinations_raw | daily_vaccinations | total_vaccinations_per_hundred | people_vaccinated_per_hundred | people_fully_vaccinated_per_hundred | daily_vaccinations_per_million | vaccines | source_name | source_website |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Afghanistan | AFG | 2021-02-22 | 0.0 | 0.0 | NaN | NaN | NaN | 0.00 | 0.00 | NaN | NaN | Johnson&Johnson, Oxford/AstraZeneca, Pfizer/Bi... | World Health Organization | https://covid19.who.int/ |
| 1 | Afghanistan | AFG | 2021-02-23 | NaN | NaN | NaN | NaN | 1367.0 | NaN | NaN | NaN | 34.0 | Johnson&Johnson, Oxford/AstraZeneca, Pfizer/Bi... | World Health Organization | https://covid19.who.int/ |
| 2 | Afghanistan | AFG | 2021-02-24 | NaN | NaN | NaN | NaN | 1367.0 | NaN | NaN | NaN | 34.0 | Johnson&Johnson, Oxford/AstraZeneca, Pfizer/Bi... | World Health Organization | https://covid19.who.int/ |
| 3 | Afghanistan | AFG | 2021-02-25 | NaN | NaN | NaN | NaN | 1367.0 | NaN | NaN | NaN | 34.0 | Johnson&Johnson, Oxford/AstraZeneca, Pfizer/Bi... | World Health Organization | https://covid19.who.int/ |
| 4 | Afghanistan | AFG | 2021-02-26 | NaN | NaN | NaN | NaN | 1367.0 | NaN | NaN | NaN | 34.0 | Johnson&Johnson, Oxford/AstraZeneca, Pfizer/Bi... | World Health Organization | https://covid19.who.int/ |
| 5 | Afghanistan | AFG | 2021-02-27 | NaN | NaN | NaN | NaN | 1367.0 | NaN | NaN | NaN | 34.0 | Johnson&Johnson, Oxford/AstraZeneca, Pfizer/Bi... | World Health Organization | https://covid19.who.int/ |
| 6 | Afghanistan | AFG | 2021-02-28 | 8200.0 | 8200.0 | NaN | NaN | 1367.0 | 0.02 | 0.02 | NaN | 34.0 | Johnson&Johnson, Oxford/AstraZeneca, Pfizer/Bi... | World Health Organization | https://covid19.who.int/ |
| 7 | Afghanistan | AFG | 2021-03-01 | NaN | NaN | NaN | NaN | 1580.0 | NaN | NaN | NaN | 40.0 | Johnson&Johnson, Oxford/AstraZeneca, Pfizer/Bi... | World Health Organization | https://covid19.who.int/ |
| 8 | Afghanistan | AFG | 2021-03-02 | NaN | NaN | NaN | NaN | 1794.0 | NaN | NaN | NaN | 45.0 | Johnson&Johnson, Oxford/AstraZeneca, Pfizer/Bi... | World Health Organization | https://covid19.who.int/ |
| 9 | Afghanistan | AFG | 2021-03-03 | NaN | NaN | NaN | NaN | 2008.0 | NaN | NaN | NaN | 50.0 | Johnson&Johnson, Oxford/AstraZeneca, Pfizer/Bi... | World Health Organization | https://covid19.who.int/ |
df2.head(10)
| | location | date | vaccine | total_vaccinations |
|---|---|---|---|---|
| 0 | Argentina | 2020-12-29 | Moderna | 2 |
| 1 | Argentina | 2020-12-29 | Oxford/AstraZeneca | 3 |
| 2 | Argentina | 2020-12-29 | Sinopharm/Beijing | 1 |
| 3 | Argentina | 2020-12-29 | Sputnik V | 20481 |
| 4 | Argentina | 2020-12-30 | Moderna | 2 |
| 5 | Argentina | 2020-12-30 | Oxford/AstraZeneca | 3 |
| 6 | Argentina | 2020-12-30 | Sinopharm/Beijing | 1 |
| 7 | Argentina | 2020-12-30 | Sputnik V | 40583 |
| 8 | Argentina | 2020-12-31 | Moderna | 2 |
| 9 | Argentina | 2020-12-31 | Oxford/AstraZeneca | 3 |
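Before cleaning, it can help to check the column types and the number of missing values in each column. The following is a minimal sketch on a small stand-in frame built from three of the rows shown above; the names `sample`, `column_types`, and `missing_per_column` are ours, and the same calls would apply to `df1` and `df2`.

```python
import pandas as pd

# Small stand-in for country_vaccinations_by_manufacturer.csv,
# built from three of the rows displayed above
sample = pd.DataFrame({
    "location": ["Argentina", "Argentina", "Argentina"],
    "date": ["2020-12-29", "2020-12-30", "2020-12-31"],
    "vaccine": ["Sputnik V", "Sputnik V", "Moderna"],
    "total_vaccinations": [20481, 40583, 2],
})

# Dates are read from CSV as plain strings; convert them to timestamps
sample["date"] = pd.to_datetime(sample["date"])

# Column types and number of missing values per column
column_types = sample.dtypes
missing_per_column = sample.isna().sum()
print(column_types)
print(missing_per_column)
```

Converting `date` is optional for the comparisons used later in this article, since ISO-formatted date strings also sort correctly as text.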
Before looking for insights in the data, we need to clean and prepare it for our requirements; this is one of the most important steps of Exploratory Data Analysis (EDA) and the foundation of an effective analysis. Our dataset contained some ‘NaN’ values and some empty rows, along with some redundant columns, which is to be expected given the inconsistencies in how the data was initially collected during the pandemic. Using the df.fillna and df.drop functions from the pandas library, we changed the ‘NaN’ values to 0 and removed the rows that weren’t required.
To analyze the dataset, we used different exploratory data analysis techniques and visualized the outcomes to provide an analysis of the ongoing vaccination programs around the world. For data ingestion, visualization, and analysis, we imported several Python packages, including NumPy (http://numpy.org/), Pandas (http://pandas.pydata.org/), Matplotlib (https://matplotlib.org/), Seaborn (https://seaborn.pydata.org/), and Plotly (https://plotly.com/).
Based on our analysis of the given dataset, we have come up with the following results:
Globally, countries rapidly began to vaccinate their citizens after several vaccines were approved in December 2020, which indicated the need for vaccines to prevent coronavirus outbreaks. The total vaccinations in a country give an indication of how effective the government's measures to accelerate the vaccination drive have been.
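The cleaning step described above can be sketched as follows. This is an illustration on a tiny made-up frame, not the exact cleaning applied to the dataset: the choice of which column to drop and the name `cleaned` are assumptions for the example.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for country_vaccinations.csv (illustrative values only)
sample = pd.DataFrame({
    "country": ["A", "A", "B", None],
    "date": ["2021-02-22", "2021-02-23", "2021-02-22", None],
    "total_vaccinations": [0.0, np.nan, 120.0, np.nan],
    "source_website": ["x", "x", "y", None],
})

# Find rows in which every column is missing and drop them
empty_rows = sample.index[sample.isna().all(axis=1)]
cleaned = sample.drop(index=empty_rows)

# Drop a column the analysis does not need (hypothetical choice)
cleaned = cleaned.drop(columns=["source_website"])

# Replace the remaining missing numeric values with 0
cleaned = cleaned.fillna({"total_vaccinations": 0})
print(cleaned)
```

Note that filling a cumulative column such as `total_vaccinations` with 0 makes the series dip on days without a report, so aggregations over it should use the maximum or the latest reported value rather than a mean.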
# Top 10 countries according to the total vaccinations.
# total_vaccinations is a running total, so take each country's maximum instead of summing over dates.
data=df1.groupby('country')['total_vaccinations'].max().sort_values(ascending=False).head(10)
fig,ax=plt.subplots(figsize =(18, 10))
# Plot in millions so the values match the axis label
ax.barh(data.index,data/1e6)
ax.grid(True, color ='grey',linestyle ='-.', linewidth = 0.9,alpha = 0.2)
# Show the largest value at the top
ax.invert_yaxis()
ax.set_title('Top 10 countries according to the total vaccinations',loc ='center')
plt.xlabel("Total vaccinations (in millions)")
plt.show()
As the above diagram shows, countries like China, India, United States, and Brazil have administered the largest numbers of doses. This success has helped them hinder the spread of the coronavirus and the severity of its symptoms to a certain extent. For densely populated countries like China and India, vaccinating the entire population was both more difficult and highly important. However, although China and India stand tall in total vaccinations, they lag behind other countries when compared on the number of people vaccinated per hundred. The diagram below depicts this scenario:
temp=df1.loc[df1['date']<='2021-01-12'].sort_values(by='people_vaccinated_per_hundred',ascending=False).groupby('country').first()
data2=temp.sort_values(by='people_vaccinated_per_hundred',ascending=False).head(10)
fig2 = px.bar(data2, x=data2.index, y='people_vaccinated_per_hundred',hover_data=['total_vaccinations'], color='people_vaccinated_per_hundred',title='Percentage of population vaccinated', height=400)
fig2.show()
Based on the graph above, we can clearly see that Israel leads with more than 22 people vaccinated per hundred. It is followed by the UAE, which has vaccinated almost 9 people per hundred, and a small territory like Gibraltar also appears in the list ahead of developed countries like United States and United Kingdom. China and India lag far behind, in the 58th and 75th positions, respectively.
The following figure depicts the number of people who are partially and fully vaccinated in each country as of 24-03-2022:
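A country's position in the per-hundred ranking can be looked up with a rank computation. The sketch below uses made-up numbers for three hypothetical countries `A`, `B`, and `C`; on the real data the same steps would run on `df1`.

```python
import pandas as pd

# Illustrative values only, not taken from the dataset
sample = pd.DataFrame({
    "country": ["A", "A", "B", "B", "C"],
    "date": ["2021-01-10", "2021-01-12", "2021-01-11", "2021-01-13", "2021-01-12"],
    "people_vaccinated_per_hundred": [1.0, 2.5, 20.0, 25.0, 0.4],
})

# Keep rows up to the cutoff date (ISO date strings compare correctly as text)
cutoff = sample[sample["date"] <= "2021-01-12"]

# Highest reported coverage per country within the window
best = cutoff.groupby("country")["people_vaccinated_per_hundred"].max()

# Rank 1 is the highest coverage; ties share the smaller rank
ranks = best.rank(ascending=False, method="min").astype(int)
print(ranks.sort_values())
```

With the real data, `ranks.loc['China']` and `ranks.loc['India']` would give the positions quoted in the text, provided both countries have a reported value before the cutoff date.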
temp1=df1.sort_values(by='people_fully_vaccinated',ascending=False).groupby('country').first()
data3=temp1.sort_values(by='people_fully_vaccinated',ascending=False).head(10)
data3['people_partially_vaccinated']=data3['people_vaccinated']-data3['people_fully_vaccinated']
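# labels maps column names to display names; 'value' and 'variable' are the names Plotly Express assigns when y is a list of columns
fig3 = px.bar(data3, x=data3.index, y=["people_partially_vaccinated", "people_fully_vaccinated"], title="People fully and partially vaccinated by country",labels={'country':'Country','value':'People vaccinated','variable':'Status'},height=800)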
fig3.show()
Again, China and India lead this tally, having fully vaccinated most of their vaccinated population. However, this data reveals an inequity when viewed alongside the graph below:
temp2=df1.loc[df1['total_vaccinations']>10000].groupby('country').last()
data4=temp2.sort_values(by='total_vaccinations',ascending=True).head(10)
fig4 = px.bar(data4, x=data4.index, y='total_vaccinations',title='Bottom 10 countries according to the total vaccinations')
fig4.show()
The above graph shows the bottom 10 countries in terms of total vaccinations, among those that have reported more than 10,000 doses. The countries listed are largely in weak economic condition and cannot purchase vaccines rapidly, which is why they lag behind in vaccinating their citizens. This gap, where countries with better economic conditions are fully vaccinating their citizens while poorer countries are still waiting for vaccines, shows the inequality in the vaccination drive going on worldwide. WHO and countries like India, United States, United Kingdom, and many more have identified this problem and have moved to help these countries by donating vaccines.
25 different vaccines are being used in various countries all over the globe. The pie chart below shows each vaccine's share of the total doses recorded in the manufacturer dataset. The vaccine from Pfizer tops this list and is the most used COVID-19 vaccine in this data. It is followed by the vaccine from Moderna and then by the vaccine from Oxford-AstraZeneca. Most of China's citizens are vaccinated with Sinovac, while most of India's citizens are vaccinated with Covaxin or Covishield.
# total_vaccinations is a running total, so keep each country's latest (maximum) count per vaccine
latest=df2.groupby(['location','vaccine'],as_index=False)['total_vaccinations'].max()
# Combine the per-country figures into one total per vaccine
data5=latest.groupby('vaccine')['total_vaccinations'].sum().reset_index()
fig5=px.pie(data5,names='vaccine',values='total_vaccinations',title='Share of total doses by vaccine',hole=0.2)
fig5.update_traces(textposition='inside')
fig5.show()
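Because `total_vaccinations` in the manufacturer table is a running total, adding up every dated row counts the early doses again on each later date. The sketch below, on made-up numbers for hypothetical locations `A` and `B` and vaccines `X` and `Y`, contrasts that naive sum with a per-vaccine total built from the latest value for each location.

```python
import pandas as pd

# Illustrative running totals (made-up numbers, not taken from the dataset)
sample = pd.DataFrame({
    "location": ["A", "A", "A", "B", "B"],
    "date": ["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-01", "2021-01-02"],
    "vaccine": ["X", "X", "X", "Y", "Y"],
    "total_vaccinations": [100, 250, 400, 50, 300],
})

# Naive: summing every dated row inflates the totals
naive = sample.groupby("vaccine")["total_vaccinations"].sum()

# Latest running total per location and vaccine, then summed per vaccine
latest = sample.groupby(["location", "vaccine"])["total_vaccinations"].max()
per_vaccine = latest.groupby(level="vaccine").sum()

# Share of all doses, in percent
share = (per_vaccine / per_vaccine.sum() * 100).round(1)
print(naive)
print(per_vaccine)
print(share)
```

Taking the maximum assumes the running total never decreases; if a source revises its figures downward, selecting the row with the latest date would be the safer choice.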
We looked closer at Pfizer's lead and tried to explore why it outperformed the rest of the industry. The dataset itself does not explain this; the answer we propose is the company's initiatives to reduce the inequalities beneath the vaccination drive carried out worldwide. Pfizer has strived to reach even the poorest countries across the globe and, in our reading, its pricing has made the vaccine accessible to citizens of countries in weaker economic condition.
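# Keep one row per country so each country is drawn once; color is categorical, so no continuous color scale is needed
countries=df1.drop_duplicates(subset='country')
fig6 = px.choropleth(countries, locations="iso_code",hover_name="country",color='vaccines',title= "Vaccines used by different countries")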
fig6.update_layout(showlegend=True)
fig6.show()
Above is a choropleth that represents the spatial variation in vaccines used by the different countries around the world. In accordance with the legend, different colors indicate different vaccines used in a particular country. Additionally, we have added a hover feature that displays the country name when interacting with the graph.
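The `vaccines` column behind this map stores a comma-separated list per country, so each distinct combination gets its own legend entry. As a rough sketch of how that column could be split to count distinct vaccines and the number of countries using each one, consider the following; the frame is made up and the names `exploded`, `countries_per_vaccine`, and `n_distinct` are ours.

```python
import pandas as pd

# Illustrative rows in the shape of country_vaccinations.csv
sample = pd.DataFrame({
    "country": ["A", "A", "B"],
    "vaccines": [
        "Moderna, Pfizer/BioNTech",
        "Moderna, Pfizer/BioNTech",
        "Pfizer/BioNTech, Sputnik V",
    ],
})

# One row per country, then one row per (country, vaccine) pair
per_country = sample.drop_duplicates("country")
exploded = per_country.assign(
    vaccine=per_country["vaccines"].str.split(", ")
).explode("vaccine")

# Number of countries using each vaccine, and number of distinct vaccines
countries_per_vaccine = (
    exploded.groupby("vaccine")["country"].nunique().sort_values(ascending=False)
)
n_distinct = exploded["vaccine"].nunique()
print(countries_per_vaccine)
print(n_distinct)
```

This assumes the separator is always a comma followed by a single space, as in the rows displayed earlier.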
We can conclude from the analysis above that vaccination inequality exists around the globe. It may not be apparent at first glance, but it does exist. COVID-19 has hit people everywhere relentlessly; our fight back against it, however, has not been uniform. The richer countries were able to fight back better because they had access to large-scale vaccination campaigns, and so contained the spread of COVID-19 far more effectively than the poorer countries, which still have widespread disease. Several nations, including India, United States, Russia, United Kingdom, and many more, have extended a helping hand to these countries. Even vaccine manufacturers like Pfizer are taking initiatives to make vaccines available to them. But more needs to be done. Our world is going through a challenging time when we need to be together as humans rather than diverge as countries or ethnicities.